feat(G4): verb_table tense modulation (Quirk CGEL grounded) by AdaWorldAPI · Pull Request #306 · AdaWorldAPI/lance-graph

AdaWorldAPI · 2026-04-30T02:42:27Z

Summary

12 VerbFamily base priors populated across 4 semantic categories (Change/Action/State/Discovery)
Tense modulation: tense_modifier(Tense) -> SlotPriorDelta breaks the broadcast-flatness — within-family priors now vary by tense/aspect/mood. Linguistically grounded in Quirk et al. CGEL §4.21–4.27, cited in module doc.
Modifiers: Perfect/Pluperfect/FuturePerfect → temporal+0.15; Continuous → temporal+0.10, modal-0.05; Imperative → temporal-0.20, modal+0.20; Potential → modal+0.25, kausal-0.05; Habitual → temporal-0.10, modal+0.05
SlotPrior::combine(delta): sum + clamp to [0.0, 1.0]
The 144 cells now vary by tense within each family (was 12 values broadcast across 12 tenses); Present, Past, and Future carry no modifier and keep the family base prior
Tense::ALL const array added to role_keys.rs

Review notes

Initial implementation broadcast 12 priors across all 12 tenses (zero Tense×Family interaction). Caught by reviewer; fixed with linguistically-grounded tense modifiers.
Two pre-existing tests silently encoded the flatness assumption at Perfect/Imperative tenses — fixed to test at Present (unmarked tense).

Test plan

test_perfect_amplifies_temporal_within_family — Causes/Perfect.temporal > Causes/Past.temporal (failing-first proven: both were 0.4 before)
test_imperative_suppresses_temporal — Imperative.temporal < Present.temporal
test_subjunctive_amplifies_modal — Potential.modal > Present.modal
test_continuous_amplifies_temporal_less_than_perfect
test_combine_clamps_to_unit_interval
21/21 verb_table tests, 324/324 contract tests pass

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

Generated by Claude Code

Seed TEKAMOLO slot priors for the 10 VerbFamily variants that were using uniform defaults (Supports, Contradicts, Refines, Grounds, Abstracts, Enables, Prevents, Transforms, Mirrors, Dissolves). Priors applied across all 12 Tense variants per family (144 cells).

Semantic profiles per grammar-landscape.md S3:
- Action verbs (Causes, Prevents, Transforms): high Kausal + Temporal
- State verbs (Supports, Contradicts, Refines, Grounds): high Modal
- Change verbs (Becomes, Abstracts, Mirrors, Dissolves): high Temporal + Modal
- Discovery verbs (Enables): high Kausal + Lokal

Also adds Tense::ALL const array and 13 new tests (one per family plus a sweep test). Total verb_table tests: 16. Contract suite: 319.

Starter values -- tune empirically with corpus statistics.

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

…se priors now vary

Closes the G4 loose end where default_table() broadcast 12 family priors across all 12 tenses, producing a degenerate 144-cell table with only 12 unique values and zero tense x family interaction.

Adds:
- SlotPriorDelta { temporal, kausal, modal, lokal, instrument }
- SlotPrior::combine(self, delta) -> SlotPrior (sum + clamp to [0,1])
- tense_modifier(tense: Tense) -> SlotPriorDelta with linguistically grounded modulation per Quirk et al. *Comprehensive Grammar of the English Language* sections 4.21-4.27 (tense / aspect / mood)
- base_prior(family) factored out from default_table()

Modulation rules (after reading the actual Tense enum from role_keys.rs; the enum has Potential, no Subjunctive; Potential fills that role):
- Perfect | Pluperfect | FuturePerfect: temporal +0.15
- PresentContinuous | PastContinuous | FutureContinuous: temporal +0.10, modal -0.05
- Imperative: temporal -0.20, modal +0.20
- Potential: temporal -0.10, kausal -0.05, modal +0.25
- Habitual: temporal -0.10, modal +0.05
- Present | Past | Future: no modifier

default_table() now iterates (family, tense) and applies final = base_prior(family).combine(tense_modifier(tense)).

Failing-test-first: test_perfect_amplifies_temporal_within_family was written and confirmed to fail on the broadcast-flat code (Causes/Perfect == Causes/Past == 0.4); after the fix it passes (0.55 > 0.4).

Also adds:
- test_imperative_suppresses_temporal (Causes: 0.2 < 0.4 temporal, modal up)
- test_subjunctive_amplifies_modal (Supports/Potential modal > Present)
- test_continuous_amplifies_temporal_less_than_perfect (ordering sanity)
- test_combine_clamps_to_unit_interval (clamping)

Two pre-existing tests that sampled non-default tenses (Refines/Perfect, Dissolves/Imperative) had encoded the broadcast-flat assumption; switched their tense to Present (unmarked, no modifier) so they keep asserting the family-level base prior. The tense-specific behaviour they previously shadowed is now covered by the new modulation tests.

cargo test -p lance-graph-contract verb_table --lib: 21 passed (was 16).
cargo test -p lance-graph-contract --lib: 324 passed, 0 failed.
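The modulation described in this commit message can be sketched as follows. This is a minimal illustration, not the code in verb_table.rs: the type and function names and the delta values come from the commit message, while the exact field types (`f32`), the derives, and the enum variant order are assumptions.

```rust
// Sketch of the tense modulation described above. Names and delta values
// follow the commit message; f32 fields and derives are assumptions.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SlotPrior {
    pub temporal: f32,
    pub kausal: f32,
    pub modal: f32,
    pub lokal: f32,
    pub instrument: f32,
}

#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct SlotPriorDelta {
    pub temporal: f32,
    pub kausal: f32,
    pub modal: f32,
    pub lokal: f32,
    pub instrument: f32,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Tense {
    Present, Past, Future,
    Perfect, Pluperfect, FuturePerfect,
    PresentContinuous, PastContinuous, FutureContinuous,
    Imperative, Potential, Habitual,
}

impl SlotPrior {
    /// Sum each slot with its delta, then clamp to [0.0, 1.0].
    pub fn combine(self, d: SlotPriorDelta) -> SlotPrior {
        let c = |base: f32, delta: f32| (base + delta).clamp(0.0, 1.0);
        SlotPrior {
            temporal: c(self.temporal, d.temporal),
            kausal: c(self.kausal, d.kausal),
            modal: c(self.modal, d.modal),
            lokal: c(self.lokal, d.lokal),
            instrument: c(self.instrument, d.instrument),
        }
    }
}

/// Per-tense delta; unmarked tenses (Present, Past, Future) get all zeros.
pub fn tense_modifier(tense: Tense) -> SlotPriorDelta {
    use Tense::*;
    let zero = SlotPriorDelta::default();
    match tense {
        Perfect | Pluperfect | FuturePerfect => SlotPriorDelta { temporal: 0.15, ..zero },
        PresentContinuous | PastContinuous | FutureContinuous => {
            SlotPriorDelta { temporal: 0.10, modal: -0.05, ..zero }
        }
        Imperative => SlotPriorDelta { temporal: -0.20, modal: 0.20, ..zero },
        Potential => SlotPriorDelta { temporal: -0.10, kausal: -0.05, modal: 0.25, ..zero },
        Habitual => SlotPriorDelta { temporal: -0.10, modal: 0.05, ..zero },
        Present | Past | Future => zero,
    }
}
```

With a base prior whose temporal slot is 0.4 (the Causes value quoted above), `combine(tense_modifier(Tense::Perfect))` raises temporal to roughly 0.55, while `Tense::Past` returns the base prior unchanged.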

- Fix 4× `#[deprecated(since = "next")]` invalid semver in context_chain.rs: drop `since` field (G3 refactor artifact)
- Fix `actor.role <= u8::MAX` tautological comparison in lance_membrane.rs:768: replace with meaningful `< 32` guard
- Document Wave-1 LOC audit in EPIPHANIES.md: recovery (#275-#283) = +8,728; Wave 1 (#300-#306) = +3,156; combined = +11,807; zero LOC lost from G1 rebase

cargo fmt --check: clean
cargo clippy (4 crates): warnings only, 0 errors

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

AdaWorldAPI · 2026-04-30T03:56:28Z

Brutally Honest Review — PRs #300-#306 + clippy fix

674 tests pass across the four touched crates (328 contract + 89 deepnsm + 75 callcenter + 182 planner). All cargo check green. HEAD at 6c5b792.

The verdict first

| PR | Rating | What it actually does |
|---|---|---|
| #300 Pipeline DAG | SOLID | Real Kahn's algorithm topo-sort, 12 tests, `execute_via_bridge` adapter for OrchestrationBridge. No consumer yet (expected: this is the keystone). |
| #301 ColumnMaskRewriter | SOLID | Real plan rewriting. Not a no-op anymore. 4 redaction modes (Null/Constant/Hash/Truncate) with `map_expressions` + `transform_down` walking Filter/Aggregate/Sort/Join/Projection. Hash UDF intentionally hard-fails at execution time (loud > silent). 3 security-leak tests verify WHERE/MAX/Hash-mode don't disclose. |
| #302 LanceAuditSink | SOLID | Real Lance I/O. `flush()` builds 7-column RecordBatch with `Timestamp(Millisecond, "UTC")` temporal type. `scan_back(n)` pushes offset to Lance scanner. 7 round-trip tests including 1000-entry flush + pagination. |
| #303 scent FNV | ACCEPTABLE | Real FNV-1a replacing XOR-fold stub. Distribution tests verify ≥50/100 unique scents. `scent_u64()` exposed for Phase C. FNV algorithm duplicated 8x across workspace (tech debt, not a bug). |
| #304 Pearl mask | SOLID | Real 3-bit causality mask from SPO triple planes. `compute_classification_distance` now returns real Hamming under `grammar-triangle` feature (was permanent 0.0). 13 tests. |
| #305 real fingerprint | SOLID | First real caller of `sentinel_fp`: `sign_binarize_to_binary16k` produces actual non-zero `Binary16K` from f32 trajectory. `DisambiguateOpts` builder replaces 4 legacy methods (deprecated, not deleted). |
| #306 verb table seed | SOLID | All 12 VerbFamily rows populated with distinct linguistically-motivated priors. Per-tense modulation (9 of 12 tenses carry non-zero deltas, in 5 modifier groups). The 144-cell table is non-degenerate. 21 verb_table tests (16 after the seed commit). |
| clippy fix | ACCEPTABLE | Genuine fixes: invalid `since = "next"` semver in `#[deprecated]`, tautological `u8 <= u8::MAX` replaced with `< 32` guard. Not suppressions. |

This batch is the strongest work from the other session. Every PR does what it claims, tests verify behavior not just compilation, and the architecture matches CLAUDE.md doctrine (methods on carriers, not free functions).

What's genuinely good

ColumnMaskRewriter (#301) has real security tests. The three "leak tests" (WHERE clause leak, MAX aggregate leak, Hash UDF binding) verify that masked columns can't be exfiltrated through indirect paths. The Hash UDF intentionally panics at runtime with NotImplemented: loud failure > silent disclosure. This is the right security posture.
LanceAuditSink (#302) uses Arrow temporal types correctly. Timestamp(Millisecond, Some("UTC")) on the schema means DataFusion temporal predicates (BETWEEN, >=) work on the audit log. The scan_back limit+offset pushdown is clean.
verb_table (#306) has real linguistic grounding. The 12 families are grouped by semantic role (Change/Action/State/Discovery) with slot weights that make sense: CAUSES has high kausal+instrument, GROUNDS has high lokal+modal, TRANSFORMS has high temporal+modal. Per-tense modulation varies: Perfect raises temporal, Imperative raises modal and lowers temporal, Potential raises modal. Not uniform, not copy-pasted.
Pipeline DAG (#300) has a real topological sort. Kahn's algorithm with cycle detection (3-node, 2-node, self-loop tests), missing-dep rejection, and duplicate-id rejection. The execute_via_bridge adapter is the right integration point: it consumes OrchestrationBridge::route() directly.
disambiguator_glue (#305) is the first real cross-crate wiring. sign_binarize_to_binary16k takes a &[f32] from MarkovBundler and produces a Binary16K for ContextChain. This is the missing link between the deepnsm encoding path and the contract crate's disambiguation path. The round-trip test verifies different bundles produce different fingerprints.
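For readers unfamiliar with sign binarization, the idea behind `sign_binarize_to_binary16k` can be sketched as below. This is an illustration under stated assumptions, not the crate's implementation: it assumes `Binary16K` holds 16,384 bits packed into 256 `u64` words, that a strictly positive component maps to a set bit, and that inputs are truncated or zero-padded to 16,384 components. The function name `sign_binarize_sketch` is hypothetical.

```rust
// Hypothetical layout: 16,384 bits packed little-endian into 256 u64 words.
pub const BITS: usize = 16_384;
pub const WORDS: usize = BITS / 64;

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Binary16K(pub [u64; WORDS]);

/// Set bit i when values[i] is strictly positive; all other bits stay 0.
/// Inputs longer than BITS are truncated, shorter inputs are zero-padded
/// (both are assumptions of this sketch).
pub fn sign_binarize_sketch(values: &[f32]) -> Binary16K {
    let mut words = [0u64; WORDS];
    for (i, &v) in values.iter().take(BITS).enumerate() {
        if v > 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    Binary16K(words)
}
```

Two trajectories that differ in the sign of any component yield different fingerprints, which is the property the round-trip test is described as checking.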

What needs attention

Tech debt: FNV-1a duplicated 8+ times

The FNV-1a 64-bit hash appears in:

audit.rs:hash_statement (hex literals)
dn_path.rs:fnv1a (decimal literals)
pipeline.rs (inline)
orchestration.rs:step_id_of (inline)
role_keys.rs:fnv64_bytes (const fn, hex)
spo/store.rs (probable)
Others

All produce identical results. This is a clear candidate for lance-graph-contract::hash::fnv1a(bytes: &[u8]) -> u64 — one canonical function, imported everywhere.
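A canonical helper along these lines might look like the sketch below. The constants are the standard 64-bit FNV-1a offset basis and prime; the function name mirrors the path suggested above, and making it a `const fn` (as `role_keys.rs:fnv64_bytes` reportedly is) is an assumption.

```rust
/// 64-bit FNV-1a over a byte slice: XOR each byte in, then multiply by the prime.
pub const fn fnv1a(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    let mut i = 0;
    // `while` instead of `for` keeps the function usable in const contexts.
    while i < bytes.len() {
        hash ^= bytes[i] as u64;
        hash = hash.wrapping_mul(PRIME);
        i += 1;
    }
    hash
}
```

Because the decimal and hex literals at the existing call sites encode the same two constants, each site could switch to this function without changing any stored hash values, provided every copy really is the 64-bit FNV-1a variant.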

#300 Pipeline DAG has no consumer

PipelineDag is exported but no production code calls it. This is expected for a keystone struct — but it means the DAG is not exercised under real orchestration conditions. The integration test uses CountingBridge which just counts calls. A real test that routes through ThinkingPipeline or CognitiveShaderDriver would catch shape mismatches.
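For reference, the behaviour credited to PipelineDag (Kahn's algorithm with cycle, missing-dep, and duplicate-id rejection) can be sketched in isolation as below. The `DagError` enum, the `topo_sort` signature, and the string step ids are hypothetical stand-ins, not the planner's actual types.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Debug, PartialEq, Eq)]
pub enum DagError {
    DuplicateId(String),
    MissingDep { step: String, dep: String },
    Cycle,
}

/// Kahn's algorithm over (step id, dependency ids) pairs.
/// Returns the ids in an order where every dependency precedes its dependents.
pub fn topo_sort(steps: &[(&str, &[&str])]) -> Result<Vec<String>, DagError> {
    // Register every id first so duplicates and missing deps are detectable.
    let mut indegree: HashMap<&str, usize> = HashMap::new();
    for &(id, _) in steps {
        if indegree.insert(id, 0).is_some() {
            return Err(DagError::DuplicateId(id.to_string()));
        }
    }
    // dependents[d] lists the steps that wait on d.
    let mut dependents: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(id, deps) in steps {
        for &dep in deps {
            if !indegree.contains_key(dep) {
                return Err(DagError::MissingDep { step: id.to_string(), dep: dep.to_string() });
            }
            dependents.entry(dep).or_default().push(id);
            *indegree.get_mut(id).expect("id registered above") += 1;
        }
    }
    // Seed the queue with dependency-free steps, in input order.
    let mut queue: VecDeque<&str> = steps
        .iter()
        .map(|&(id, _)| id)
        .filter(|id| indegree[*id] == 0)
        .collect();
    let mut order = Vec::with_capacity(steps.len());
    while let Some(id) = queue.pop_front() {
        order.push(id.to_string());
        if let Some(children) = dependents.get(id) {
            for &child in children {
                let remaining = indegree.get_mut(child).expect("child registered above");
                *remaining -= 1;
                if *remaining == 0 {
                    queue.push_back(child);
                }
            }
        }
    }
    // Any step never emitted sits on a cycle (including self-loops).
    if order.len() == steps.len() { Ok(order) } else { Err(DagError::Cycle) }
}
```

An integration test through a real bridge would exercise the same ordering, with each emitted step routed instead of merely collected.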

#301 Hash UDF is a hard-fail placeholder

NotYetWiredHashUdf at policy.rs:136 intentionally panics on invoke. This is correct for now (security: never silently pass unhashed sensitive data). But the FNV-1a hash that should fill this slot already exists 8 times in the codebase. Wiring it is a 10-line change.

#302 LanceAuditSink `scan_back` order assumption

scan_back(n) computes skip = total - n then reads n rows starting at skip. This assumes Lance stores rows in insertion order and that scanner.limit(Some(n), Some(offset)) respects that order. Lance datasets are append-only, so this is correct — but there's no test that verifies ordering across multiple flushes (the multi-flush test only checks count, not order).
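The window arithmetic and the missing ordering check can be sketched without Lance at all. The in-memory model below is an assumption used only for illustration: flushes are plain vectors appended in order, and `scan_back_window` and `scan_back_model` are hypothetical names. The `saturating_sub` is also an addition of this sketch, covering the case where `n` exceeds the row count.

```rust
/// (offset, limit) selecting the last `n` of `total` rows.
/// saturating_sub keeps the offset at 0 when n > total.
pub fn scan_back_window(total: usize, n: usize) -> (usize, usize) {
    let skip = total.saturating_sub(n);
    (skip, total - skip)
}

/// In-memory stand-in for a multi-flush audit log: rows are stored in
/// insertion order, one Vec per flush.
pub fn scan_back_model<T: Clone>(flushes: &[Vec<T>], n: usize) -> Vec<T> {
    let all: Vec<T> = flushes.iter().flatten().cloned().collect();
    let (skip, limit) = scan_back_window(all.len(), n);
    all.into_iter().skip(skip).take(limit).collect()
}
```

An ordering test against the real sink would have the same shape: flush several batches with monotonically increasing markers, call `scan_back(n)`, and assert that the returned markers are the last `n` in insertion order, not just that `n` rows came back.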

#304 feature-off path still returns 0.0

Without grammar-triangle, analyze_without_triangle returns classification_distance: 0.0. This means the extrapolation routing path in ticket_emit (classification_distance > 0.7 → Extrapolation) remains inert for the default feature set. The fix exists under the feature flag; enabling it in CI would close the gap.
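Why a constant 0.0 leaves the routing inert can be seen in a small model. Everything here is a hypothetical sketch: the `Route` enum, the `triangle_enabled` flag standing in for the `grammar-triangle` feature, and the choice to normalise Hamming distance over packed `u64` words are assumptions; only the 0.7 threshold and the 0.0 fallback come from the text above.

```rust
pub const EXTRAPOLATION_THRESHOLD: f32 = 0.7;

#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    Extrapolation,
    Standard, // hypothetical name for the non-extrapolation path
}

/// Normalised Hamming distance in [0, 1] when the feature is on,
/// and the constant 0.0 fallback when it is off.
pub fn classification_distance(a: &[u64], b: &[u64], triangle_enabled: bool) -> f32 {
    if !triangle_enabled {
        return 0.0;
    }
    let bits = (a.len().min(b.len()) * 64) as f32;
    if bits == 0.0 {
        return 0.0;
    }
    let differing: u32 = a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum();
    differing as f32 / bits
}

/// Route to extrapolation only when the distance exceeds the threshold.
pub fn route(distance: f32) -> Route {
    if distance > EXTRAPOLATION_THRESHOLD {
        Route::Extrapolation
    } else {
        Route::Standard
    }
}
```

With the flag off, even maximally different inputs produce 0.0, so `route` can never return `Route::Extrapolation`; this is the gap that enabling the feature in CI would close.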

Session arc: honest assessment

The other session had three phases:

#286-#293 (math substrate): Solid executable proofs. 45 jc tests. One fabricated citation (Köstenberger→Sturm), but the math is correct. The strongest work.
#294-#297 (context loss): "Writes without reading first." Routed M1 to wrong crate, proposed a structurally nonsensical COCA-vs-Jina comparison, and wrote 748 LOC implementing Ward clustering when the substrate uses farthest-pair binary split. Self-corrected via #299 (revert + clean rewrite). Damage bounded: no production code hallucinated, only docs.
#300-#306 (recovery): The strongest batch. Real plan rewriting with security tests (#301), real Lance I/O (#302), real Pearl mask (#304), real fingerprint wiring (#305), real verb-table seed (#306). Every PR does what it claims. Tests verify behavior. Architecture matches doctrine.

The session got better, not worse. The #294-#297 stumble was a read-discipline failure, not a competence failure. Once the session started reading code before writing (visible in #300+ where each refactor commit message cites specific file:line it read), quality recovered and exceeded the math-only early work by actually wiring things together.

Recommended follow-ups

FNV dedup — extract to lance-graph-contract::hash::fnv1a. One function, 8+ call sites. ~30 LOC.
Wire PipelineDag — add a pipeline_integration test in planner that routes 3 steps through ShaderDriver. ~80 LOC.
Wire Hash UDF — replace NotYetWiredHashUdf with the existing FNV function. 10 LOC.
scan_back ordering test — multi-flush + assert row order matches insertion order. ~20 LOC.
Enable grammar-triangle in CI — so classification_distance stops being 0.0 on the default path.

Bottom line

6 SOLID, 2 ACCEPTABLE, 0 SUSPICIOUS, 0 CONFABULATED. This is what the other session looked like when it read before it wrote. The recovery from #294-#297 to #300-#306 is a demonstration that the read-before-write discipline works when followed and breaks when not.

claude added 2 commits April 29, 2026 20:05

AdaWorldAPI merged commit 40718e4 into main Apr 30, 2026
0 of 4 checks passed

AdaWorldAPI merged 2 commits into main from claude/pr-g4-verb-table-seed
